Abstract. We present a PTAS for computing the maximum a posteriori
assignment on Pairwise Markov Random Fields with non-negative
weights in planar graphs. This algorithm is practical and not far
behind state-of-the-art techniques in image processing. MAP on Pairwise
Markov Random Fields with (possibly) negative weights cannot be
approximated unless P = NP, even on planar graphs. We also show via
reduction that this yields a PTAS for one scoring function of Correlation
Clustering in planar graphs.


1 Introduction 


Pairwise Markov Random Fields (MRFs) model distributions in a variety of
applications and arise in fields as diverse as statistical physics, computer vision,
coding theory, computational biology, machine learning, and combinatorial
optimization. Solving associated optimization problems is critical in practice and
also of high theoretical importance.

We briefly review the statistical view on MRFs before focusing on the
combinatorial problem. A pairwise MRF is a set of $n$ random variables
$X = \{X_1, \ldots, X_n\}$ over label set $\{1, \ldots, L\}$ and a graph $G = (X, E)$, where

$$\Pr[X = x] = \frac{1}{Z} \exp\Big( \sum_{i \in X} \phi_i(x_i) + \sum_{(i,j) \in E} \psi_{ij}(x_i, x_j) \Big),$$

where $\phi_i$ and $\psi_{ij}$ are arbitrary functions and $Z$ is a normalizing constant.
Intuitively, $\phi_i(x_i)$ can be regarded as vertex $i$'s preference for label $x_i$ and
$\psi_{ij}(x_i, x_j)$ as the compatibility between labels $x_i$ and $x_j$ on the endpoints of edge $ij$.
We are interested in finding a maximum a posteriori (MAP) assignment $x^*$,
i.e. $x^* = \arg\max_x \Pr[X = x]$. Finding the MAP label assignment corresponds
to this optimization problem:



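Pairwise MAP MRF

Instance:

— graph $G = (V, E)$,

— label set $\mathcal{L} = \{1, \ldots, L\}$,

— singleton functions $\phi_i(\cdot) : \mathcal{L} \to \mathbb{R}$ for all $i \in V$,

— pairwise functions $\psi_{ij}(\cdot, \cdot) : \mathcal{L} \times \mathcal{L} \to \mathbb{R}$ for all $(i, j) \in E$.

Solution: for each $v \in V$, a label assignment $x_v \in \mathcal{L}$.

Maximize:

$$\sum_{v \in V} \phi_v(x_v) + \sum_{(u,v) \in E} \psi_{uv}(x_u, x_v)$$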

Throughout the paper, we will assume G to be connected; combining the
solutions on each component handles the case of a disconnected graph.
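As a concrete reading of the objective, the following Python sketch evaluates it for a given assignment. The data layout (dictionaries `phi` and `psi` keyed by vertex and by edge, labels indexed from 0) and the name `mrf_score` are assumptions made for illustration only; they are not part of the formulation above.

```python
def mrf_score(edges, phi, psi, x):
    """Objective value of assignment x: vertex terms plus edge terms.

    edges: list of (u, v) pairs
    phi:   phi[v][a] is the singleton value of label a at vertex v
    psi:   psi[(u, v)][a][b] is the pairwise value of labels (a, b) on edge (u, v)
    x:     x[v] is the label assigned to vertex v (labels are 0..L-1 here)
    """
    vertex_part = sum(phi[v][x[v]] for v in phi)
    edge_part = sum(psi[(u, v)][x[u]][x[v]] for (u, v) in edges)
    return vertex_part + edge_part


# Tiny example: a path a - b - c with two labels.
edges = [("a", "b"), ("b", "c")]
phi = {"a": [1, 0], "b": [0, 0], "c": [0, 2]}
disagree = [[0, 3], [3, 0]]  # reward 3 when the two endpoints differ
psi = {e: disagree for e in edges}
example_score = mrf_score(edges, phi, psi, {"a": 0, "b": 1, "c": 1})
```

Here the vertex terms contribute 1 + 0 + 2 and only the edge (a, b) has differing endpoints, so the assignment scores 6.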

Pairwise MAP MRF has been considered in many domains; however, for
general graphs we will show:

Theorem 1. There is an $\alpha > 0$ such that, unless P = NP, there is no
polynomial-time $\alpha$-approximation algorithm for Pairwise MAP MRF, even for
nonnegative $\phi$ and $\psi$.

In light of this, we focus on planar graphs as many real-world instances, 
such as those from computer vision, are planar or nearly planar. It turns out 
that Pairwise MAP MRF is still NP-hard on planar graphs [5]. However, 
restricting our attention to planar graphs allows for much better approximation 
algorithms. 

We will additionally require $\phi_i$ and $\psi_{ij}$ to be nonnegative. By setting

$$\phi'_i(x) = \phi_i(x) - \min_{a \in \mathcal{L}} \phi_i(a) \qquad \forall\, i \in V,\ x \in \mathcal{L}$$

and

$$\psi'_{ij}(x, y) = \psi_{ij}(x, y) - \min_{a, b \in \mathcal{L}} \psi_{ij}(a, b) \qquad \forall\, (i, j) \in E,\ \forall\, x, y \in \mathcal{L},$$

we can transform an instance with general weights into an instance with
nonnegative weights with the same optimal assignment. However, this changes the
value of the objective function, and thus also the approximation ratio. This
restriction is necessary, as with general weights Pairwise MAP MRF is
impossible to approximate unless P = NP. In particular:

Theorem 2. The existence of an algorithm approximating Pairwise MAP
MRF on planar graphs with maximum degree 4 and nonpositive $\phi_i$ and $\psi_{ij}$
to any multiplicative factor implies P = NP.

In many applications, MRF is used to minimize an energy function. Notice
that this is equivalent to maximizing the negative energy function. Thus
Theorem 2 implies minimization is inapproximable to any multiplicative factor, even
if the energy function is nonnegative.
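Returning to the shift to nonnegative weights described before Theorem 2, the sketch below shows why the optimal assignment survives while the approximation ratio does not. The function name `shift_to_nonnegative` and the dictionary layout are illustrative assumptions.

```python
def shift_to_nonnegative(phi, psi):
    """Subtract each function's minimum so that every value becomes >= 0.

    Returns the shifted functions and the total amount subtracted; the score
    of every assignment drops by exactly that constant.
    """
    offset = 0
    phi_shifted, psi_shifted = {}, {}
    for v, values in phi.items():
        low = min(values)
        phi_shifted[v] = [a - low for a in values]
        offset += low
    for e, table in psi.items():
        low = min(min(row) for row in table)
        psi_shifted[e] = [[a - low for a in row] for row in table]
        offset += low
    return phi_shifted, psi_shifted, offset


# One edge, two labels: the score is -10 if the labels agree and -9 otherwise.
phi = {"a": [0, 0], "b": [0, 0]}
psi = {("a", "b"): [[-10, -9], [-9, -10]]}
phi2, psi2, offset = shift_to_nonnegative(phi, psi)
```

After the shift the same two outcomes score 0 and 1. The best assignment is unchanged, but an assignment that was within a small multiplicative factor of optimal before the shift now has value 0.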

A polynomial-time approximation scheme (PTAS) is an algorithm that, given
an instance of a maximization (minimization) problem and a precision
parameter $0 < \epsilon < 1$, returns a $(1 - \epsilon)$-approximate ($(1 + \epsilon)$-approximate, resp.)
solution in time polynomial in the size of the instance (with a possible
exponential dependence on $1/\epsilon$). An efficient PTAS (EPTAS) is one with runtime of the
form $O(f(\epsilon)\,\mathrm{poly}(n))$, where $n$ is the size of the instance and $f$ is a computable
function.

Our main result is: 

Theorem 3. There is a PTAS for Pairwise MAP MRF in planar graphs
when all $\phi$ and $\psi$ are nonnegative functions.

We also consider the closely related Correlation Clustering problem. 
In this, one is given a graph and tasked with partitioning the vertices into an 
arbitrary number of clusters. The edges have associated rewards and preferences 
as to whether their endpoints should or should not be in the same cluster; the 
objective function is the sum of the weights of the edges whose preferences are 
satisfied. Correlation Clustering is sometimes expressed with a penalty for 
unsatisfied edges in addition to, or instead of, a reward for satisfied edges. These 
formulations all have the same optimal solution, but as in Pairwise MAP MRF, 
the value of the objective function changes, and thus approximation results may 
differ as well. 

Formally, the version we will address is: 

Correlation Clustering

Instance:

— graph $G = (V, E)$,

— edge preferences $p : E \to \{0, 1\}$,

— edge reward function $w : E \to \mathbb{R}_{\ge 0}$.

Solution: a partition of the vertices into clusters.

Maximize:

$$\sum_{(u,v) \in E} w(u,v) \big[ (1 - p(u,v))\,C(u,v) + p(u,v)\,(1 - C(u,v)) \big]$$

where $C(u, v)$ is 1 if $u$ and $v$ belong to the same cluster and 0 otherwise.
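The Correlation Clustering objective can be evaluated directly from a partition. The sketch below is one way to do so; the name `clustering_score` and the representation of a partition as a vertex-to-cluster dictionary are assumptions for illustration.

```python
def clustering_score(edges, p, w, cluster_of):
    """Total reward of the edges whose preference is satisfied.

    p[(u, v)] is 0 when the edge prefers its endpoints together and 1 when it
    prefers them apart; w[(u, v)] >= 0 is the reward; cluster_of[v] is the
    identifier of the cluster containing v.
    """
    total = 0
    for (u, v) in edges:
        same = 1 if cluster_of[u] == cluster_of[v] else 0  # C(u, v)
        pref = p[(u, v)]
        total += w[(u, v)] * ((1 - pref) * same + pref * (1 - same))
    return total


# Triangle: two "together" edges and one "apart" edge.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
p = {("a", "b"): 0, ("b", "c"): 0, ("a", "c"): 1}
w = {("a", "b"): 2, ("b", "c"): 1, ("a", "c"): 5}
triangle_score = clustering_score(edges, p, w, {"a": 0, "b": 0, "c": 1})
```

With clusters {a, b} and {c}, the edges (a, b) and (a, c) are satisfied and (b, c) is not, so the score is 2 + 5 = 7. No partition can satisfy all three edges of this triangle.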

Via a simple reduction to Pairwise MAP MRF: 

Corollary 1. There is an EPTAS for Correlation Clustering in planar 
graphs. 


1.1 Outline 


In Section 2 we review past work on Pairwise MAP MRF. Next, in Section 3
we give an exact algorithm for graphs of bounded branchwidth. Then, Section 4
proves Theorem 3. In the interest of space, proofs of Corollary 1 and Theorems 1
and 2 are in the appendix. We demonstrate some promising experimental results
in Section 5 with applications to computer vision. Finally, we offer discussion in
Section 6.





2 Prior Work 


Markov Random Fields originated in statistical physics as a generalization of
the Ising Model [?]. There are numerous techniques to solve Pairwise MAP
MRF, both in general and on specific instances; some are outlined here.

An MRF is binary if there are exactly two labels and submodular if for
all $u, v \in V$ and all $i, j \in \{1, \ldots, L\}$,
$\psi_{u,v}(i,i) + \psi_{u,v}(j,j) \ge \psi_{u,v}(i,j) + \psi_{u,v}(j,i)$.

If an MRF is both binary and submodular, Pairwise MAP MRF can be
solved exactly in polynomial time by reduction to Min-Cut. If the graph is
also planar, the running time can be improved to $O(n \log n)$ [20].

For MRF in graphs which have bounded degree and an excluded minor (which
includes all bounded degree planar graphs), Jung and Shah use techniques similar
to ours to find a PTAS with running time doubly exponential in $1/\epsilon$ [12]. For the
alternate formulation of Correlation Clustering which seeks to minimize
penalties for unsatisfied edges, Klein et al. demonstrate a non-efficient PTAS in
planar graphs [?].

When the $\psi_{ij}$ are defined by a metric on the labels, the problem is referred to as
Metric Labeling; [15] provides an $O(\log L \log\log L)$-approximation algorithm
for the problem.

The Generalized Potts Model, from statistical mechanics, is a
restriction of MRF that reduces to the classic Multiway Cut problem; [5] uses local
search to approximate this model. Multiway Cut is a special case of Metric
Labeling, where some vertices are forced to have particular labels. In planar
graphs, there is a PTAS for the problem [3]. In general, there are constant-factor
approximations [5].

0-Extension is a generalization of Multiway Cut in which the cost of the
edge depends on the specific terminals associated with the edge’s endpoints,
not just whether the terminals are the same. In general graphs, this
problem is $O(\log L / \log\log L)$-approximable [14,7,11] and can be approximated to
a constant factor in planar graphs.

Various heuristics exist to approximate MAP on planar graphs and are used 
extensively in computer vision for applications such as: 

— Stereo vision: given two photographs taken side-by-side, estimate the depth
of each pixel.

— Object segmentation: find the boundaries of objects in photographs. 

— De-noising: remove grainy noise from an image. 

— Photomontage: combine several images into one. 

Two standard benchmarks for these problems are OpenGM [13] and the
Middlebury stereo dataset [19]. For a detailed treatment of MRF as applied to computer
vision, see, e.g., [24].

Many problems, including Pairwise MAP MRF and more traditional
optimization problems such as TSP, Steiner Tree, Vertex Cover, Graph
Coloring, Clique, Hamiltonian path, and Feedback Vertex Set can be
solved exactly in polynomial time on graphs of bounded branchwidth.
Branchwidth, like treewidth, pathwidth, bandwidth, outerplanarity, or cliquewidth, is a
measure of the “simplicity” of a graph. These measures are amenable to dynamic
programming and have been of great importance when designing approximation
schemes on planar graphs [1,10,16,4].

Our algorithm draws inspiration from Baker’s technique [1], a powerful
framework for building PTASes in planar graphs. In a nutshell, Baker guesses a way
to decompose a graph into a number of smaller graphs of bounded
outerplanarity. These smaller graphs are each solved optimally and independently, and
then combining the solutions incurs at most $\epsilon \cdot \mathrm{OPT}$ error. This technique was
originally applied to Independent Set but can be used for a number of
problems, such as Vertex Cover, Edge-Disjoint Triangles, and Dominating
Set [1].

Recently, Wang posted a manuscript on arXiv claiming a PTAS for
Pairwise MAP MRF on planar graphs, among other results [23]. We remark that
our main result, Theorem 3, was discovered independently. Theorem 2 draws
inspiration from and strengthens a hardness proof of Wang. Unfortunately, there
appears to be a bug in a vital lemma in [23]. We discuss this in Appendix B.


3 Pairwise MAP MRF in Bounded Branchwidth Graphs 

A branch decomposition of a graph $G = (V, E)$ is an unrooted binary tree $T$
whose leaves are the edges $E$ of $G$. Deleting an edge of $T$ generates two subgraphs
of $G$, each induced by the edges in one component of $T$. Some vertices are
contained in both subgraphs. The maximum number of these overlapping vertices
for any such pair of subgraphs is the width of the decomposition. The minimum
width of any branch decomposition of $G$ is its branchwidth.
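To make the definition of width concrete, the sketch below computes it for a decomposition given as nested pairs. The representation is an assumption for illustration: a leaf is an edge `(u, v)`, an internal node is a pair of subtrees, and the outermost pair plays the role of one subdivided edge of $T$. Vertices are assumed not to be tuples themselves, and self-loops are assumed absent.

```python
from collections import Counter


def edges_below(node):
    """Edges of G stored at the leaves of the subtree rooted at node."""
    if not isinstance(node[0], tuple):  # a leaf is a single edge (u, v)
        return [node]
    return edges_below(node[0]) + edges_below(node[1])


def decomposition_width(tree):
    """Largest number of vertices shared by the two sides of any tree edge."""
    all_degrees = Counter(v for e in edges_below(tree) for v in e)
    width = 0
    stack = [tree]
    while stack:
        node = stack.pop()
        inside = Counter(v for e in edges_below(node) for v in e)
        # A vertex lies on both sides when some, but not all, of its
        # incident edges are below this node.
        shared = [v for v in inside if inside[v] < all_degrees[v]]
        width = max(width, len(shared))
        if isinstance(node[0], tuple):
            stack.extend(node)
    return width


# A 4-cycle 0-1-2-3-0, split into two paths of two edges each.
cycle_tree = (((0, 1), (1, 2)), ((2, 3), (3, 0)))
cycle_width = decomposition_width(cycle_tree)
```

Every split of this tree separates the cycle at exactly two vertices, so the computed width is 2.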

Our PTAS is an application of Baker’s technique [1], and works by breaking 
up the problem into bounded branchwidth subproblems, each of which can be 
solved exactly in polynomial time. 

Theorem 4. Given a Pairwise MAP MRF instance $(G = (V, E), L, \phi, \psi)$
and a branch decomposition $T$ of width $k$, an optimal solution can be found in
time $O(|E| k L^{2k})$.

Proof. We use dynamic programming. T will guide the dynamic program, and 
thus we want a root with two children. To that end, we choose an arbitrary edge 
of T and subdivide it with a new vertex r that we designate the root. Now T is a 
rooted binary tree but maintains the other properties of a branch decomposition. 

For each tree vertex $v \in T$, let $G(v)$ be the subgraph of $G$ induced by
the edges of $G$ which are descendants of $v$. Observe that $G(r) = G$. Denote
by $\delta(G(v))$ the vertices of $G(v)$ which are incident to edges not in $G(v)$. Note
that $|\delta(G(v))| \le k$ for all $v \in T$.

For each vertex $v \in T$ and each possible assignment to the vertices $\delta(G(v))$,
we will compute the assignment to the vertices $V(G(v)) - \delta(G(v))$ which maximizes
the score of the MRF on $G(v)$. This is done bottom-up, so that for all non-leaf
vertices of $T$, assignments for both of their children are computed first.

If $v$ is a leaf, $G(v)$ is a single edge with its endpoints. Thus either
$V(G(v)) - \delta(G(v))$ is empty and finding the optimal assignment is trivial; or
$V(G(v)) - \delta(G(v))$ is a single endpoint, and all possible label assignments can be tested.
In both cases, it takes $O(L^2)$ time to determine, for all possible boundary assignments,
the best assignment to $V(G(v)) - \delta(G(v))$.

If $v$ is not a leaf, it has two children $u_1, u_2$. Let $U = \delta(G(u_1)) \cup \delta(G(u_2))$
and $I = \delta(G(u_1)) \cap \delta(G(u_2))$. Notice $\delta(G(v)) \subseteq U$. For each label assignment
to the vertices of $\delta(G(v))$, the best assignment to $V(G(v)) - \delta(G(v))$ is the
union of best assignments to $V(G(u_1)) - \delta(G(u_1))$ and $V(G(u_2)) - \delta(G(u_2))$ for
some assignment to $I - \delta(G(v))$, and its value is the sum of the values of those
assignments minus the values of $\phi$ on $I$, which would otherwise be counted twice. As
those assignments and values have already been computed, finding the optimal ones can be
done in time $O(|I| \cdot L^{|U|})$. Since $|I| \le k$ and $|U| \le 2k$, computing all the
assignments and values at vertex $v$ takes time $O(k L^{2k})$.

$\delta(G(r))$ is empty, so the unique assignment and value computed at $r$ are
the exact optimal solution to the Pairwise MAP MRF instance. The rooted
branch decomposition has $2|E| - 1$ vertices, thus the running time is $O(|E| k L^{2k})$.

□


We summarize the algorithm:


1. Choose an arbitrary edge $e$ of $T$, and subdivide it with a new root
vertex $r$.

2. With each vertex $v$ of $T$ associate the subgraph $G(v)$ of $G$ induced by
the edges of $G$ which are descendants of $v$ (with respect to the root $r$).

3. Consider each vertex $v$ of $T$ from leaf to root:

(a) If $v$ is a leaf, for each possible label assignment to the vertices of
$\delta(G(v))$, by brute force, compute the best assignment to $V(G(v)) \setminus \delta(G(v))$.

(b) Otherwise, for each possible label assignment to the vertices of
$\delta(G(v))$, combine the values and assignments of $v$’s two children
to determine the best assignment to $V(G(v)) \setminus \delta(G(v))$.

4. Return the best assignment for $G(r) = G$.
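The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the rooted decomposition is given as nested pairs whose leaves are edges `(u, v)`, vertices are sortable non-tuple values, labels are `0..num_labels-1`, and `phi` and `psi` are dictionaries. For brevity the sketch returns only the optimal value and recomputes subtree edge sets instead of caching them.

```python
from collections import Counter
from itertools import product


def subtree_edges(node):
    """Edges of G at the leaves below node (a leaf is an edge (u, v))."""
    if not isinstance(node[0], tuple):
        return [node]
    return subtree_edges(node[0]) + subtree_edges(node[1])


def map_value(tree, num_labels, phi, psi):
    """Optimal objective value, by dynamic programming over the decomposition."""
    total_deg = Counter(v for e in subtree_edges(tree) for v in e)
    labels = range(num_labels)

    def solve(node):
        # Boundary: vertices of G(node) with an incident edge outside G(node).
        inside = Counter(v for e in subtree_edges(node) for v in e)
        boundary = sorted(v for v in inside if inside[v] < total_deg[v])
        table = {}  # boundary labelling -> best value on G(node)
        if not isinstance(node[0], tuple):
            u, v = node
            for xu, xv in product(labels, repeat=2):
                value = phi[u][xu] + phi[v][xv] + psi[node][xu][xv]
                chosen = {u: xu, v: xv}
                key = tuple(chosen[w] for w in boundary)
                table[key] = max(table.get(key, float("-inf")), value)
            return boundary, table
        (b1, t1), (b2, t2) = solve(node[0]), solve(node[1])
        union = sorted(set(b1) | set(b2))
        shared = set(b1) & set(b2)
        for labelling in product(labels, repeat=len(union)):
            chosen = dict(zip(union, labelling))
            # Both children counted phi on the shared vertices; remove one copy.
            value = (t1[tuple(chosen[w] for w in b1)]
                     + t2[tuple(chosen[w] for w in b2)]
                     - sum(phi[w][chosen[w]] for w in shared))
            key = tuple(chosen[w] for w in boundary)
            table[key] = max(table.get(key, float("-inf")), value)
        return boundary, table

    _, table = solve(tree)
    return table[()]  # the boundary of the root is empty
```

Recovering the assignment itself would require storing, next to each table entry, the child labellings that achieved it.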


4 PTAS for Pairwise MAP MRF on Planar Graphs 


We now give the PTAS for our main result. As input, we are given an instance
of Pairwise MAP MRF $(G = (V, E), L, \phi, \psi)$ where $G$ is a planar graph, and
a desired error parameter $0 < \epsilon < 1$, with $k = 1/\epsilon$.

Fix some vertex $r$. We say an edge has $r$-level $d$ if one of its endpoints is
hop-distance $d - 1$ from $r$ and the other is hop-distance $d$. Let $G_j$ be the graph
resulting from deleting all edges with $r$-levels congruent to $j \pmod{k}$.

The algorithm is: 




1. Choose a vertex $r$ arbitrarily.

2. Let $k = 1/\epsilon$.

3. For each $j \in \{0, \ldots, k - 1\}$:

(a) Compute $G_j$.

(b) Find an approximate branch decomposition $T$ of each component
of $G_j$ using the algorithm in [?].

(c) Apply Theorem 4 to each component of $G_j$ and combine the
resulting best label assignments into $x_j$.

(d) Compute the value $h_j$ of the objective function on $G$ from $x_j$.

4. Return the assignment corresponding to the largest $h_j$.
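Step 3(a) can be sketched as a breadth-first search followed by a filter on edge levels. The function name `level_decomposition` and the edge-list representation are assumptions for illustration; the graph is assumed connected, as elsewhere in the paper.

```python
from collections import deque


def level_decomposition(vertices, edges, root, k):
    """Return the edge lists of G_0, ..., G_{k-1}.

    G_j keeps every edge except those whose r-level is congruent to j mod k.
    """
    adjacency = {v: [] for v in vertices}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    # Hop distances from the root by breadth-first search.
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    graphs = [[] for _ in range(k)]
    for u, v in edges:
        # An edge has r-level d when its endpoints are at distances d - 1 and d.
        # Edges joining two vertices at equal distance have no r-level and
        # are therefore kept in every G_j.
        level = max(dist[u], dist[v]) if dist[u] != dist[v] else None
        for j in range(k):
            if level is None or level % k != j:
                graphs[j].append((u, v))
    return graphs


# Path 0 - 1 - 2 - 3 - 4 rooted at 0 with k = 2: edge levels are 1, 2, 3, 4.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
g0, g1 = level_decomposition(range(5), path_edges, 0, 2)
```

Each edge is dropped from at most one of the $k$ graphs, which is the property the correctness argument relies on.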

With this, we are ready to prove our main result. 

Proof (of Theorem 3). First, we tackle the runtime. For each $j$, it takes linear
time to construct $G_j$ by building a breadth-first search tree from $r$.

By construction, there exists in each component of $G_j$ a path of length at
most $k$ from each vertex to a vertex on the face containing $r$. An algorithm by
Tamaki [?] allows us to construct a branch decomposition of width at most
$2k$ on a graph with this property in time $O(m_i 2^{\ldots})$, where $m_i$ is the number of
edges in the component.

Then, solving these optimally using Theorem 4 and combining takes time $O(|E| k L^{4k})$.
As we try $k$ different choices of $j$, the total running time is $k$ times the cost of a
single choice. This is linear in the size of the graph, as $k$ is a function of $\epsilon$.
However, as $L$ is part of the input, this is not an efficient PTAS.

Now, we demonstrate correctness. Let $x^*$ be an optimal label assignment and
let $H$ be the objective function on $G$. By construction, $x_j$ is the optimal
assignment on $G_j$. Let $H_j$ be the objective function restricted to $G_j$. Since
$x_j$ consists of optimal solutions of each component of $G_j$, $H_j(x_j) \ge H_j(x^*)$.

Let $d_j = H(x^*) - H_j(x^*)$. Because all weights are nonnegative, $H(x_j) \ge H_j(x_j)$,
so we have $H(x_j) \ge H(x^*) - d_j$. Summing over all choices of $j$,

$$\sum_{j=0}^{k-1} H(x_j) \ge \sum_{j=0}^{k-1} \big( H(x^*) - d_j \big).$$

Each edge in $G$ is missing from at most one $G_j$, so $\sum_{j=0}^{k-1} d_j \le H(x^*)$. Thus,

$$\sum_{j=0}^{k-1} H(x_j) \ge kH(x^*) - H(x^*) = k(1 - 1/k)H(x^*) = k(1 - \epsilon)H(x^*).$$

Consequently, there exists some $j$ where $H(x_j) \ge (1 - \epsilon)H(x^*)$. □

5 Experiments 

The approximation scheme has relatively small constants, which suggested that
it might be feasible to use in practice. We implemented a version of this PTAS
in C++11 for tasks that arise in computer vision. For simplicity, we restricted
our implementation to grid graphs, as is common in image processing. Optimal
branch decompositions are particularly easy to find in this domain.

5.1 Stereo Matching 

Given two images representing a left camera angle and a right camera angle and
a number $L$ of relative depth labels, we wish to assign a label in $\{1, \ldots, L\}$ to
each pixel in the, say, left image. In the computer vision community, these are
often visualized as disparity maps, or grayscale images of the relative depths; see
e.g. Figure 1. We use the 16-label tsukuba example from the Middlebury stereo
benchmark [19] for illustration here:



Fig. 1: Tsukuba images from the Middlebury stereo benchmark. 


We used the following model as input to our algorithm. The graph $G = (V, E)$
is the planar grid graph where each vertex represents a pixel. We define functions

$$\phi_u(i) = \beta - \|u - u^{(i)}\|_2^2 \qquad \forall\, u \in V$$

$$\psi_{u,v}(i, j) = \begin{cases} 0 & \text{if } i = j \\ \beta - \|u - v\|_2^2 & \text{if } i \ne j \end{cases} \qquad \forall\, (u, v) \in E$$

where $u$ is a pixel in the left image, $u^{(i)}$ is the pixel that is $i$ columns to the
left of the pixel corresponding to $u$ in the right image, $\|\cdot\|_2^2$ is the squared 2-norm
in CIELUV color space, and $\beta$ is a constant sufficiently large to ensure that all
outputs of the functions are positive.

In addition to our basic algorithm, we also incorporate a few very simple
vision-specific heuristics to refine our results. Initializing boundary pixels to the
values from the previous (either left or right) connected component yields more
visually continuous results. Since the analysis of the approximation holds for any
value of the boundary pixels, in particular it holds for these values. Thus the
approximation guarantee is preserved at this step. However, this results in some
visual artifacts (see Figure 2).



Fig. 2: Visual artifacts after one heuristic. 


To remedy this, we run the algorithm twice (intuitively, left-to-right and then 
right-to-left) and combine the solutions in an approximation-preserving way. 

Finally, a tiny amount of smoothing is done to remove remaining noise; this 
does not guarantee the approximation but leads to more visually-pleasing results. 

We use the evaluation tools provided on the Middlebury stereo website: 5.07%
of all pixels are mislabeled including 3.02% of non-occluded regions and 11.5%
of regions near depth discontinuities. Furthermore, as seen in Figure 4d, a large
fraction of mislabeled pixels are concentrated in the bottom right; we believe
that discrepancies between the MRF model and the ground truth explain this.

State-of-the-art algorithms mislabel a little more than 1% of pixels including
typically over 4% of regions near depth discontinuities. Many of the published
algorithms on the Middlebury benchmark mislabel significantly more than 5.07%
of all pixels, and the best algorithms involve optimizing dozens of
hyperparameters and are highly specialized to their applications.

We found that our generic PTAS required only a few basic heuristics to 
perform quite well, suggesting that with a few more heuristics, it could be very 
competitive. 


5.2 Observed $\epsilon$ dependencies

Experiments support the theoretical dependencies on $\epsilon$. Figures 3a and 3b show
the score and log running time, respectively, of our algorithm on the tsukuba
image as a function of $1/\epsilon$, using 14 labels and the learned parameters. The
score changes remarkably little, considering the improvement in the theoretical
bound.

Fig. 3: (a) Score (in arbitrary units) as a function of $1/\epsilon$. (b) Log of running
time (in seconds) as a function of $1/\epsilon$. Experiments were run on a mid-range 2014 laptop.

The running time matches the theory very closely. The observed ratios of
running time as $1/\epsilon$ increases from 2 to 5 are 23, 18.45, and 17.8; the theoretical
run time is proportional to $1/\epsilon \cdot L^{1/\epsilon}$, which would predict ratios of 21, 18.67,
and 17.5.
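The predicted ratios can be reproduced with a few lines of Python, assuming the running time scales as $(1/\epsilon) \cdot L^{1/\epsilon}$ with $L = 14$ labels; this is the form consistent with the three ratios quoted above. The function name `predicted_runtime` is illustrative.

```python
def predicted_runtime(inv_eps, num_labels):
    """Model running time, up to a constant factor: (1/eps) * L**(1/eps)."""
    return inv_eps * num_labels ** inv_eps


# Ratio of consecutive model running times as 1/eps goes 2 -> 3 -> 4 -> 5.
ratios = [predicted_runtime(t + 1, 14) / predicted_runtime(t, 14) for t in (2, 3, 4)]
print([round(r, 2) for r in ratios])  # prints [21.0, 18.67, 17.5]
```

Each ratio equals $14 \cdot (t + 1)/t$, so it approaches the number of labels as $1/\epsilon$ grows.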

5.3 OpenGM benchmark 

We used the OpenGM 2.3.3 [13] library to benchmark the actual energy
minimization performance of our algorithm compared to other existing methods. Our
algorithm was run with $\epsilon = 1/3$.

On the Inpainting benchmark, our algorithm achieves a score of 461.82, which
is about 1.6% away from the best algorithm’s and better than half of the
competing algorithms. On the Object Segmentation benchmark, we perform a bit
worse; our score is about 64% away from the best and worse than most of the
competition.

6 Discussion and Conclusions

Our algorithm gives the first known PTAS for maximum a posteriori assignment
on Pairwise MAP MRF, and the first EPTAS for this variant of Correlation
Clustering in planar graphs. Combined with our hardness results, much of the
complexity of Pairwise MAP MRF on planar graphs is now settled. While the
algorithm is not directly competitive with the state of the art for computer vision
tasks, it is sufficiently close to those algorithms to suggest applications in
improving them, as well as in other applications which lack specialized algorithms.

One can readily extend the given PTAS to more general classes of graphs, or 
(non-pairwise) MRFs in planar graphs with bounded factor degree. 

Compelling future research directions include studying Pairwise MAP MRF
with negative functions and two labels (but not necessarily submodular),
and with more than two labels but submodular functions.
A Elided Proofs


Theorem 1. There is an $\alpha > 0$ such that, unless P = NP, there is no
polynomial-time $\alpha$-approximation algorithm for Pairwise MAP MRF, even for
nonnegative $\phi$ and $\psi$.

Proof. Maximum Cut is NP-hard to approximate to better than a 60/61
factor [22]. There is an approximation-preserving reduction from Max Cut to
Pairwise MAP MRF with $L = 2$ labels, by setting $\phi_i(x_i) = 0$ and
$\psi_{ij}(x_i, x_j)$ to be 1 if $x_i \ne x_j$ and 0 otherwise. □
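The reduction can be sketched as follows. The two labels stand for the two sides of the cut, so the score of an assignment equals the number of cut edges. The function names and dictionary layout are illustrative assumptions.

```python
def max_cut_to_mrf(vertices, edges):
    """Build phi and psi for the Max Cut instance (two labels, 0 and 1)."""
    phi = {v: [0, 0] for v in vertices}            # no vertex preference
    psi = {e: [[0, 1], [1, 0]] for e in edges}     # reward 1 when endpoints differ
    return phi, psi


def assignment_value(edges, phi, psi, x):
    """Pairwise MAP MRF objective of assignment x."""
    return (sum(phi[v][x[v]] for v in phi)
            + sum(psi[(u, v)][x[u]][x[v]] for (u, v) in edges))


# Triangle: any cut separates one vertex from the other two and cuts 2 edges.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
phi, psi = max_cut_to_mrf("abc", triangle)
cut_value = assignment_value(triangle, phi, psi, {"a": 0, "b": 1, "c": 1})
```

Here the side {a} versus {b, c} cuts the edges (a, b) and (a, c), and the MRF score is 2 accordingly.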

Theorem 2. The existence of an algorithm approximating Pairwise MAP
MRF on planar graphs with maximum degree 4 and nonpositive $\phi_i$ and $\psi_{ij}$
to any multiplicative factor implies P = NP.

Proof. The proof of this theorem is a modification of a proof of a weaker theorem by
Wang [23].

Given a planar graph $G$, we construct a Pairwise MAP MRF instance
which has a score of 0 if and only if $G$ is 3-colorable. As 3-colorability is
NP-complete even on planar graphs of maximum degree 4 [?], and an
approximation algorithm to a multiplicative factor must find a solution of weight 0 if
one exists, this implies the theorem.

The Pairwise MAP MRF instance operates on $G$ with $L = 3$ and functions

$$\phi_i(x) = 0 \qquad \forall\, x \in \{1, 2, 3\},\ i \in V,$$

$$\psi_{i,j}(x, y) = \begin{cases} 0 & \text{if } x \ne y \\ -1 & \text{if } x = y \end{cases} \qquad \forall\, (i, j) \in E.$$

An assignment of score 0 is a 3-coloring where the labels are colors; the
coloring is proper, as any edge with both endpoints of the same color would
imply the value of the Pairwise MAP MRF instance is negative. Similarly, a
3-coloring induces an assignment of score 0. □
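A brute-force check on two small planar graphs illustrates the equivalence. The function below, whose name is an assumption for illustration, enumerates all labelings and is only meant for tiny instances.

```python
from itertools import product


def coloring_mrf_optimum(vertices, edges):
    """Best score of the instance with phi = 0 and psi = -1 on monochromatic edges."""
    best = None
    for labels in product(range(3), repeat=len(vertices)):
        x = dict(zip(vertices, labels))
        # Each edge whose endpoints share a label contributes -1.
        value = -sum(1 for (u, v) in edges if x[u] == x[v])
        best = value if best is None else max(best, value)
    return best


triangle = [(0, 1), (1, 2), (0, 2)]
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
triangle_opt = coloring_mrf_optimum([0, 1, 2], triangle)
k4_opt = coloring_mrf_optimum([0, 1, 2, 3], k4)
```

The triangle is 3-colorable, so its optimum is 0; the complete graph on four vertices is planar but not 3-colorable, so its optimum is negative.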

Corollary 1. There is an EPTAS for Correlation Clustering in planar 
graphs. 

Proof. We present an approximation-preserving reduction from Correlation
Clustering to Pairwise MAP MRF; with that, Theorem 3 gives the result.

Given an instance $(G, w, p)$ of Correlation Clustering where $G$ is planar,
we construct an instance of Pairwise MAP MRF with the same graph and $L = 4$.

If $x$ is an assignment to this Pairwise MAP MRF instance, we make a
cluster out of each maximal connected subgraph with the same label.

Edges with endpoints of different labels are exactly the edges between
clusters, so the value of this Correlation Clustering solution is the same as the
value of $x$.
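One choice of potentials consistent with this argument is sketched below. It is an assumption inferred from the Correlation Clustering objective, not necessarily the exact construction intended in the proof: vertex terms are zero, and each edge pays its reward exactly when its preference is satisfied.

```latex
\phi_v(x) = 0 \quad \forall\, v \in V, \qquad
\psi_{uv}(x, y) =
\begin{cases}
  w(u,v)\,\bigl(1 - p(u,v)\bigr) & \text{if } x = y, \\
  w(u,v)\,p(u,v)                 & \text{if } x \neq y,
\end{cases}
\quad \forall\, (u,v) \in E.
```

Under this choice, an edge whose endpoints share a label contributes $w(u,v)$ only if $p(u,v) = 0$, and an edge whose endpoints differ contributes $w(u,v)$ only if $p(u,v) = 1$. All values are nonnegative because $w \ge 0$.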

In the other direction, we contract each cluster of a given partition down 
into a single supervertex to yield graph G', which is also planar. By the 4-color 
theorem, there exists an assignment of the labels {1, 2,3,4} to the vertices of G' 
such that no adjacent vertices have the same label. 

Give each vertex in $G$ the same label as the corresponding supervertex in $G'$.
Edges within a cluster have both endpoints corresponding to the same supervertex,
and thus they have the same label. Edges between clusters have corresponding
edges in $G'$, and thus have endpoints with different labels. Thus the value of the
assignment is exactly the value of the partition.

Both the creation of the corresponding Pairwise MAP MRF instance and
the conversion of a solution of that instance to a solution of Correlation
Clustering take time linear in the size of the input. Thus there is a linear
time approximation-preserving reduction, which, in conjunction with Theorem 3,
completes the proof. Note that while the PTAS for Pairwise MAP MRF is not
an efficient PTAS, this one is, because $L = 4 = O(1)$. □


B Discussion of [23]

Lemma 4.2 of [23] is critical to the correctness of Wang’s PTAS; as presented, it 
has some problems. 

The stated runtime does not account for the degree of the graph; fi has 
possible outputs if vertex Vi has degree d; all possible outputs must be examined 
to ensure correctness. 

Additionally, is defined to be the max-sum of the liberal functions 

attached to vertices of {U n Vt; ) \ (Xp. U SXp- ). In a nice tree decomposition of a 
star, that resulting set is empty for all i except the root r, which means that the 
entire value of is , in this case, is defined to be the sum of 

liberal functions attached to every vertex in the star when the configuration of 
just the root is fixed to be cr^y.. So, calculating Njf’y- is equivalent to solving 
the original problem and how it is calculated is not specified. 


C Additional Figures 



Fig. 4: Our results on tsukuba with heuristics applied. (c) Ground truth for
comparison. (d) Mislabeled pixels highlighted.








